Mapping Lexical Entries in
Author
Abstract
This paper describes automatic techniques for mapping 9611 entries in a database of English verbs to WordNet senses. The verbs were initially grouped into 491 classes based on syntactic categories. Mapping these classified verbs into WordNet senses provides a resource that may be used for disambiguation in multilingual applications such as machine translation and cross-language information retrieval. Our techniques make use of (1) a training set of 1791 disambiguated entries, representing 1442 verb entries from 167 of the categories; (2) word sense probabilities based on frequency counts in a previously tagged corpus; (3) semantic similarity of WordNet senses for verbs within the same class; and (4) probabilistic correlations between WordNet data and attributes of the verb classes. The best results achieved 72% precision and 58% recall, versus a lower bound of 62% precision and 38% recall for assigning the most frequently occurring WordNet sense, and an upper bound of 87% precision and 75% recall for human judgment.
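To make the lower-bound comparison concrete, here is a minimal Python sketch of a most-frequent-sense baseline scored with precision and recall. It is an illustration under stated assumptions, not the paper's implementation: the function names, the sense labels, the frequency counts, and the gold annotations are all hypothetical toy data, and the scoring (correct assignments divided by assignments made for precision, and by gold senses for recall) is one plausible reading of the metrics, since the abstract does not define them.

```python
def most_frequent_sense(verb, sense_counts):
    """Return the sense with the highest corpus count for `verb`, or None if unseen."""
    counts = sense_counts.get(verb)
    if not counts:
        return None
    return max(counts, key=counts.get)


def precision_recall(predicted, gold):
    """Score predicted sense sets against gold sense sets, entry by entry.

    predicted, gold: dict mapping an entry to a set of sense labels.
    Only entries present in `gold` are scored.
    """
    correct = sum(len(predicted.get(e, set()) & senses) for e, senses in gold.items())
    n_pred = sum(len(predicted.get(e, set())) for e in gold)
    n_gold = sum(len(senses) for senses in gold.values())
    precision = correct / n_pred if n_pred else 0.0
    recall = correct / n_gold if n_gold else 0.0
    return precision, recall


# Hypothetical frequency counts from a sense-tagged corpus (toy values).
sense_counts = {
    "run": {"run#1": 10, "run#2": 3},
    "bank": {"bank#1": 2},
}

# Hypothetical gold annotations; an entry may carry more than one sense.
gold = {
    "run": {"run#1", "run#2"},
    "bank": {"bank#2"},
    "hum": {"hum#1"},
}

# Baseline: assign each entry its single most frequent sense, if any is known.
predicted = {}
for verb in gold:
    sense = most_frequent_sense(verb, sense_counts)
    if sense is not None:
        predicted[verb] = {sense}

precision, recall = precision_recall(predicted, gold)
print(precision, recall)  # 0.5 0.25
```

In this toy setup the baseline proposes at most one sense per entry, so recall falls well below precision whenever entries have several gold senses or no corpus counts at all. That is the same direction as the gap between the reported 62% precision and 38% recall, though the sketch does not claim to explain those figures.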
Similar resources
Phonological underspecification and mapping mechanisms in the speech recognition lexicon.
The problem of recognizing phonological variations in the speech input has triggered numerous treatments in speech processing models. Two areas of current controversy are the possibility of phonological underspecification in the mental lexicon and the nature of the mapping mechanism from the speech signal to the abstract lexical entry. We present data from cross-modal repetition priming experim...
Markedness and Agreement
This paper presents an account of the interpretation of unmarked verb forms in which the entries of unmarked forms are uniformly unspecified for agreement features. The entries of impersonal verbs directly sanction agreement-neutral syntagmatic structures. However, the entries of unmarked personal verbs sanction structures with negative agreement values, as a consequence of an inflectional bloc...
Indowordnets help in Indian Language Machine Translation
Because Indian languages are low-resource, the development of Indian-Indian and English-Indian language MT systems faces difficulty in translating various lexical phenomena. In this paper, we present our work on a comparative study of 440 phrase-based statistical trained models for 110 language pairs across 11 Indian languages. We have developed 110 baseline Statistical Machine Translation systems. Then we have...
Constructions License Verb Frames
Where does a verb’s frame come from? The obvious answer is the verb itself, and this is the answer that syntacticians have traditionally provided, whether they describe predicator-argument relations as syntactic sisterhood relations or as lexical properties (the predicator’s combinatoric potential, or valence). Thus, Haegeman, in her introduction to Government and Binding theory, states, “the t...
Coupling of WordNet entries for ontology mapping using virtual documents
Facilitating information exchange is a crucial service for ontology-based knowledge systems. This can be achieved by the mapping of two heterogeneous ontologies. Many mapping frameworks utilize language-based knowledge resources such as WordNet. By coupling all ontology concepts to a corresponding entry in WordNet, one can quantify the lexical relatedness of any two ontology concepts. However, c...
Automatic Construction of an English-Chinese Bilingual FrameNet
We propose a method of automatically constructing an English-Chinese bilingual FrameNet where the English FrameNet lexical entries are linked to the appropriate Chinese word senses. This resource can be used in machine translation and cross-lingual IR systems. We coerce the English FrameNet into Chinese using a bilingual lexicon, frame context in FrameNet and taxonomy structure in HowNet. Our a...